CSCI 2980 Project Report Data Migration from S-Store to BigDAWG
Authors
Abstract
Since spring 2016, I have been working with Prof. Stan Zdonik on a project about data migration from S-Store to the BigDAWG polystore system. S-Store, which is built on top of H-Store, is the world's first transactional streaming database system. It maintains all the transactional support of a traditional relational database while also supporting the stream processing needed in real-time applications. BigDAWG, which is built on top of a variety of storage engines by MIT, is a polystore system that provides cross-system querying, exploratory analysis, real-time decision support, and complex analytics. S-Store will serve as a front-end streaming engine within BigDAWG, as well as a relational in-memory database in the relational island. In either case, the data within S-Store needs to be migrated to different storage systems in BigDAWG. This report first introduces these two systems and then discusses the motivation and use cases for the migration process. It then describes the decisions and implementation in detail, as well as the performance testing of the implemented migrator. In the last part of the report, I also offer some observations and propose work that can be done in the future.
Similar resources
BigDAWG Polystore Release and Demonstration
The Intel Science and Technology Center for Big Data is developing a reference implementation of a Polystore database. The BigDAWG (Big Data Working Group) system supports “many sizes” of database engines, multiple programming languages and complex analytics for a variety of workloads. Our recent efforts include application of BigDAWG to an ocean metagenomics problem and containerization of Big...
The BigDAWG Architecture
BigDAWG is a polystore system designed to work on complex problems that naturally span across different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and semantic completeness via islands of information and a middleware that provides a uniform m...
Demonstrating the BigDAWG Polystore System for Ocean Metagenomics Analysis
In most Big Data applications, the data is heterogeneous. As we have been arguing in a series of papers, storage engines should be well suited to the data they hold. Therefore, a system supporting Big Data applications should be able to expose multiple storage engines through a single interface. We call such systems, polystore systems. Our reference implementation of the polystore concept is ca...
CSCI 1760 - Final Project Report A Parallel Implementation of Viterbi’s Decoding Algorithm
This report describes parallel Java implementations of several variants of Viterbi’s algorithm, discussed in my recent paper [1]. The aim of this project is to study the issues that arise when trying to implement the approach of [1] in parallel using Java. I compare and discuss the performance of several variants under various circumstances.
Data Ingestion for the Connected World
In this paper, we argue that in many “Big Data” applications, getting data into the system correctly and at scale via traditional ETL (Extract, Transform, and Load) processes is a fundamental roadblock to being able to perform timely analytics or make real-time decisions. The best way to address this problem is to build a new architecture for ETL which takes advantage of the push-based nature o...
Publication date: 2016